Performance Analysis of the Cilk Locality Runtime

نویسنده

  • Ramsay Shuck
چکیده

Many parallel programming platforms, such as Cilk Plus, implement a work-stealing scheduler to load-balance parallel computations, since it provides a provably good execution time bound and performs well in practice. However, most work-stealing schedulers, including the Cilk Plus scheduler, do not utilize shared caches and memory efficiently. The Cilk Locality runtime was developed to address the poor locality in the Cilk Plus runtime with the aim of improving overall performance. In this paper, the performance of the Cilk Locality runtime is compared to the Cilk Plus runtime using a number of different metrics. The results show that the Cilk Locality runtime achieves a modest improvement over the Cilk Plus runtime due to improved locality

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Programming Parallel Applications in

Cilk (pronounced \silk") is a C-based language for multithreaded parallel programming. Cilk makes it easy to program irregular parallel applications, especially as compared with data-parallel or message-passing programming systems. A Cilk programmer need not worry about protocols and load balancing, which are handled by Cilk's provably eecient runtime system. Many regular and irregular Cilk app...

متن کامل

Performance Benchmarking Locality Aware Runtime for NUMA Architecture

Non-Uniform Memory Access architectures introduce a new level of difficulty to programmers. Without the knowledge of the underlying runtime the performance of programs can suffer because they are unaware of the difference in memory latencies in NUMA systems. We seek to alleviate this issue by implementing a runtime that schedules computations on individual NUMA nodes, hopefully countering this ...

متن کامل

Cost Model: Work, Span and Parallelism

In this class, we will overview various parallel programming technology developed. One of the technology we will examine closely is Cilk, a C/C++ based concurrency platform. Before we overview the development and implementation of Cilk, we shall first overview a brief history of Cilk technology to account for where the major concepts originate. Cilk technology has developed and evolved over mor...

متن کامل

SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters

Multithreaded parallel system with software Distributed Shared Memory (DSM) is an attractive direction in cluster computing. In these systems, distributing workloads and keeping the shared memory operations efficient are critical issues. Distributed Cilk (Cilk 5.1) is a multithreaded runtime system for SMP clusters with the support of divide-and-conquer programming paradigm. However, there is n...

متن کامل

Memory-mapping support for reducer hyperobjects Citation

Reducer hyperobjects (reducers) provide a linguistic abstraction for dynamic multithreading that allows different branches of a parallel program to maintain coordinated local views of the same nonlocal variable. In this paper, we investigate how thread-local memory mapping (TLMM) can be used to improve the performance of reducers. Existing concurrency platforms that support reducer hyperobjects...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017